Association mapping and significance estimation via the coalescent.

Authors

  • Gad Kimmel
  • Richard M Karp
  • Michael I Jordan
  • Eran Halperin
Abstract

The central questions asked in whole-genome association studies are how to locate associated regions in the genome and how to estimate the significance of these findings. Researchers usually do this by testing each SNP separately for association and then applying a suitable correction for multiple-hypothesis testing. However, SNPs are correlated by the unobserved genealogy of the population, and a more powerful statistical methodology would attempt to take this genealogy into account. Leveraging the genealogy in association studies is challenging, however, because the inference of the genealogy from the genotypes is a computationally intensive task, in particular when recombination is modeled, as in ancestral recombination graphs. Furthermore, if large numbers of genealogies are imputed from the genotypes, the power of the study might decrease if these imputed genealogies create an additional multiple-hypothesis testing burden. Indeed, we show in this paper that several existing methods that aim to address this problem suffer either from low power or from a very high false-positive rate; their performance is generally not better than the standard approach of separate testing of SNPs. We suggest a new genealogy-based approach, CAMP (coalescent-based association mapping), that takes into account the trade-off between the complexity of the genealogy and the power lost due to the additional multiple hypotheses. Our experiments show that CAMP yields a significant increase in power relative to that of previous methods and that it can more accurately locate the associated region.
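For illustration, the standard baseline that the abstract contrasts with CAMP (testing each SNP separately, then correcting for multiple hypotheses) can be sketched as follows. This is a minimal sketch, not the authors' code and not CAMP itself: the choice of an allelic chi-square test with a Bonferroni correction, the function names `allelic_chi2_pvalue` and `bonferroni_hits`, and the toy allele counts are all assumptions made for the example.

```python
import math


def allelic_chi2_pvalue(case_counts, control_counts):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    Each argument is a hypothetical (minor_allele_count, major_allele_count)
    pair for one SNP, in cases and in controls respectively.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # monomorphic SNP or empty group: no evidence either way
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_hits(pvalues, alpha=0.05):
    """Return indices of SNPs that pass a Bonferroni-corrected threshold."""
    if not pvalues:
        return []
    threshold = alpha / len(pvalues)
    return [i for i, p in enumerate(pvalues) if p <= threshold]


# Toy data (assumed, not from the paper): ((case minor, case major),
# (control minor, control major)) for three SNPs.
snps = [
    ((30, 70), (30, 70)),  # identical frequencies in cases and controls
    ((60, 40), (30, 70)),  # minor allele enriched in cases
    ((25, 75), (30, 70)),  # small, non-significant difference
]

pvalues = [allelic_chi2_pvalue(cases, controls) for cases, controls in snps]
hits = bonferroni_hits(pvalues)
print(hits)
```

Because every SNP is tested independently, this baseline ignores the correlation between SNPs induced by the shared genealogy, which is the limitation the abstract sets out to address.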

Similar resources

Coalescent-based association mapping and fine mapping of complex trait loci.

We outline a general coalescent framework for using genotype data in linkage disequilibrium-based mapping studies. Our approach unifies two main goals of gene mapping that have generally been treated separately in the past: detecting association (i.e., significance testing) and estimating the location of the causative variation. To tackle the problem, we separate the inference into two stages. ...

Inference in Kingman's Coalescent with Particle Markov Chain Monte Carlo Method

We propose a new algorithm to do posterior sampling of Kingman’s coalescent, based upon the Particle Markov Chain Monte Carlo methodology. Specifically, the algorithm is an instantiation of the Particle Gibbs Sampling method, which alternately samples coalescent times conditioned on coalescent tree structures, and tree structures conditioned on coalescent times via the conditional Sequential Mo...

Linkage disequilibrium, haplotype evolution, and the coalescent

Linkage disequilibrium has become an important tool for fine-scale mapping. We argue that in order to understand the pattern of association between alleles at different loci, as well as DNA sequence polymorphism, it is useful first to consider the underlying genealogy of the chromosomes. The stochastic process known as the coalescent provides an extremely convenient way of modeling such genealo...

Linkage disequilibrium: what history has to tell us.

Linkage disequilibrium has become important in the context of gene mapping. We argue that to understand the pattern of association between alleles at different loci, and of DNA sequence polymorphism in general, it is useful first to consider the underlying genealogy of the chromosomes. The stochastic process known as the coalescent is a convenient way to model such genealogies, and in this pape...

Geostatistically estimation and mapping of forest stock in a natural unmanaged forest in the Caspian region of Iran

Estimation and mapping of forest resources are preconditions for management, planning and research. In this study, we applied kriging interpolation of geostatistics for estimation and mapping of forest stock attributes in a natural, uneven-aged, unmanaged forest in the Caspian region of northern Iran. The site of the study has an area of 516 ha and an elevation that ranges from 1100 to 1450 m ...

Journal:
  • American journal of human genetics

Volume 83, Issue 6

Pages: -

Publication date: 2008